Defense against Adversarial Attacks Using High-Level Representation Guided Denoiser
Neural networks are vulnerable to adversarial examples, which poses a threat
to their application in security-sensitive systems. We propose the high-level
representation guided denoiser (HGD) as a defense for image classification.
Standard denoisers suffer from the error amplification effect, in which small
residual adversarial noise is progressively amplified and leads to wrong
classifications. HGD overcomes this problem by using a loss function defined as
the difference between the target model's outputs activated by the clean image
and denoised image. Compared with ensemble adversarial training which is the
state-of-the-art defending method on large images, HGD has three advantages.
First, with HGD as a defense, the target model is more robust to either
white-box or black-box adversarial attacks. Second, HGD can be trained on a
small subset of the images and generalizes well to other images and unseen
classes. Third, HGD can be transferred to defend models other than the one
guiding it. In the NIPS competition on defense against adversarial attacks, our
HGD solution won first place and outperformed other models by a large margin.
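The guiding loss described above can be sketched in a few lines. The following numpy toy (with a made-up stand-in for the target model's high-level activations; nothing here is the paper's actual architecture) shows the idea of supervising the denoiser at the representation level rather than the pixel level:

```python
import numpy as np

# Hedged sketch of the HGD training signal: instead of pixel-level
# reconstruction error, the denoiser is supervised by the difference
# between the guiding model's high-level activations on the clean image
# and on the denoised image. `model_features` is a toy stand-in (an
# assumption for illustration) for the target CNN's top-layer features.

def model_features(x, w):
    """Toy stand-in for the guiding model's high-level representation."""
    return np.tanh(x @ w)

def hgd_loss(clean, denoised, w):
    """L1 distance between high-level activations of clean vs denoised input."""
    return np.abs(model_features(clean, w) - model_features(denoised, w)).mean()

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 4))
clean = rng.standard_normal((2, 8))
noisy = clean + 0.1 * rng.standard_normal((2, 8))

# A perfect denoiser yields zero loss; residual noise increases it.
assert hgd_loss(clean, clean, w) == 0.0
assert hgd_loss(clean, noisy, w) > 0.0
```

Because the loss is measured after the network's nonlinear feature extractor, small pixel residuals that would otherwise be amplified layer by layer are penalized directly at the representation where misclassification happens.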
Effects of stator-rotor interaction on unsteady aerodynamic load of compressor rotor blades
During compressor operation, the unsteady aerodynamic load induced by the interaction of stator and rotor blade rows is the main vibration source of blade high-cycle fatigue, and it directly influences the fatigue strength of compressor blades. Further research on unsteady aerodynamic load is therefore important for improving the service life and reliability of compressor blades. Based on an aero-engine compressor rotor system, a three-dimensional flow-field model of the upstream stator and downstream rotor is established. Using numerical simulation, the compressor flow characteristics are solved at successive moments in time. The paper then analyzes the process of stator-rotor interaction and the distribution of aerodynamic load on the rotor blade. In addition, the effects on rotor blade aerodynamic load are discussed for different pressure ratios, rotational speeds, and ratios of stator to rotor blade number. The results show that an unsteady flow-field region of lower velocity is induced by stator-rotor interaction at the rotor blade leading edge. When the overlap between the stator and rotor channels is at its maximum, the mass flow and static pressure around the rotor blade exhibit jumping fluctuations. The unsteady aerodynamic load fluctuates periodically, and its dominant frequencies lie mainly at multiples of the stator-rotor interaction frequency, especially at the fundamental frequency (1×f0). Within the interaction period T, the aerodynamic loads on the pressure and suction surfaces vary with opposite trends, and both the magnitude and the pulsation amplitude on the pressure surface are far greater than those on the suction surface. The effect of pressure ratio on the two surfaces is consistent: the magnitude of the aerodynamic load increases with pressure ratio. Rotational speed and the stator-rotor blade number ratio affect the magnitude of the aerodynamic load on the suction surface more strongly than on the pressure surface. As rotational speed increases, the unsteady characteristics of the aerodynamic load are enhanced.
In addition, the pulsation amplitude and peak value of the unsteady aerodynamic load reach their maximum when the stator-rotor blade number ratio λ = 1. This research provides a theoretical basis for the dynamic design of aero-engine compressor rotor systems.
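The interaction frequency f0 referenced above is, in the standard turbomachinery convention, the blade-passing frequency seen by the rotor: the stator blade count times the shaft rotation frequency, with harmonics at integer multiples. A minimal sketch of that relation (the blade count and speed below are illustrative numbers, not values from this study):

```python
# Hedged sketch: the dominant excitation frequency f0 of stator-rotor
# interaction is the blade-passing frequency, i.e. the number of stator
# blades times the shaft rotation frequency; harmonics appear at k * f0.
# The inputs below are made-up illustrative values, not from the paper.

def blade_passing_frequency(n_stator_blades, shaft_speed_rpm):
    """Blade-passing frequency in Hz for a rotor behind n_stator_blades."""
    return n_stator_blades * shaft_speed_rpm / 60.0

f0 = blade_passing_frequency(38, 12000)  # 38 stator blades at 12 000 rpm
assert f0 == 7600.0
harmonics = [k * f0 for k in (1, 2, 3)]  # 1x, 2x, 3x interaction frequency
```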
Fast multiplication of random dense matrices with fixed sparse matrices
This work focuses on accelerating the multiplication of a dense random matrix
with a (fixed) sparse matrix, which is frequently used in sketching algorithms.
We develop a novel scheme that takes advantage of blocking and recomputation
(on-the-fly random number generation) to accelerate this operation. The
techniques we propose decrease memory movement, thereby increasing the
algorithm's parallel scalability in shared memory architectures. On the Intel
Frontera architecture, our algorithm can achieve 2x speedups over libraries
such as Eigen and Intel MKL on some examples. In addition, with 32 threads, we
can obtain a parallel efficiency of up to approximately 45%. We also present a
theoretical analysis for the memory movement lower bound of our algorithm,
showing that under mild assumptions, it is possible to beat the data movement
lower bound of general matrix-matrix multiply (GEMM) by a factor of ,
where is the cache size. Finally, we incorporate our sketching algorithm
into a randomized least squares solver. For extremely over-determined sparse
input matrices, we show that our results are competitive with SuiteSparse; in
some cases, we obtain a speedup of 10x over SuiteSparse.
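The recomputation idea can be illustrated with a small sketch: blocks of the random matrix are regenerated deterministically from seeds rather than stored, so only the fixed sparse operand and the output move through memory. This is an illustrative numpy toy under that one assumption, not the paper's optimized implementation:

```python
import numpy as np

# Illustrative sketch (not the paper's implementation): compute C = R @ S,
# where R is a dense random matrix (m x n) that is never materialized and
# S is a fixed sparse matrix (n x k). Each column block of R is regenerated
# on the fly from a deterministic seed, trading recomputation of random
# numbers for reduced memory movement.

def random_block(seed, rows, cols):
    """Deterministically regenerate one column block of R from its seed."""
    return np.random.default_rng(seed).standard_normal((rows, cols))

def sketch_multiply(m, n, sparse_cols, block=4):
    """C = R @ S, with S given as {col_index: [(row, value), ...]}."""
    k = len(sparse_cols)
    C = np.zeros((m, k))
    for j0 in range(0, n, block):            # stream over column blocks of R
        Rb = random_block(j0, m, min(block, n - j0))
        for c, entries in sparse_cols.items():
            for r, v in entries:             # only touch nonzeros of S
                if j0 <= r < j0 + Rb.shape[1]:
                    C[:, c] += v * Rb[:, r - j0]
    return C

# Check against an explicit dense product built from the same seeds.
m, n = 3, 8
S = {0: [(1, 2.0), (5, -1.0)], 1: [(0, 3.0)]}
R = np.hstack([random_block(j0, m, min(4, n - j0)) for j0 in range(0, n, 4)])
Sd = np.zeros((n, 2))
for c, entries in S.items():
    for r, v in entries:
        Sd[r, c] = v
assert np.allclose(sketch_multiply(m, n, S), R @ Sd)
```

The design point is that the random operand costs only a seed per block, so the algorithm's data traffic is dominated by the sparse matrix and the output, which is what allows the GEMM-style memory lower bound to be beaten.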
A distributed-memory parallel algorithm for discretized integral equations using Julia
Boundary value problems involving elliptic PDEs such as the Laplace and the
Helmholtz equations are ubiquitous in physics and engineering. Many such
problems have alternative formulations as integral equations that are
mathematically more tractable than their PDE counterparts. However, the
integral equation formulation poses a challenge in solving the dense linear
systems that arise upon discretization. In cases where iterative methods
converge rapidly, existing methods that draw on fast summation schemes such as
the Fast Multipole Method are highly efficient and well established. More
recently, linear complexity direct solvers that sidestep convergence issues by
directly computing an invertible factorization have been developed. However,
storage and compute costs are high, which limits their ability to solve
large-scale problems in practice. In this work, we introduce a
distributed-memory parallel algorithm based on an existing direct solver named
"strong recursive skeletonization factorization." The analysis of its
parallel scalability applies generally to a class of existing methods that
exploit the so-called strong admissibility. Specifically, we apply low-rank
compression to certain off-diagonal matrix blocks in a way that minimizes data
movement. Given a compression tolerance, our method constructs an approximate
factorization of a discretized integral operator (dense matrix), which can be
used to solve linear systems efficiently in parallel. Compared to iterative
algorithms, our method is particularly suitable for problems involving
ill-conditioned matrices or multiple right-hand sides. Large-scale numerical
experiments are presented to demonstrate the performance of our implementation
using the Julia language.
OTOv2: Automatic, Generic, User-Friendly
The existing model compression methods via structured pruning typically
require complicated multi-stage procedures. Each individual stage necessitates
substantial engineering effort and domain knowledge from end-users, which
prevents wider application to broader scenarios. We propose the second
generation of Only-Train-Once (OTOv2), which first automatically trains and
compresses a general DNN only once from scratch to produce a more compact model
with competitive performance, without fine-tuning. OTOv2 is automatic and
pluggable into various deep learning applications, and requires minimal
engineering effort from users. Methodologically, OTOv2 proposes two major
improvements: (i) Autonomy: automatically exploits the dependency of general
DNNs, partitions the trainable variables into Zero-Invariant Groups (ZIGs), and
constructs the compressed model; and (ii) Dual Half-Space Projected Gradient
(DHSPG): a novel optimizer to more reliably solve structured-sparsity problems.
Numerically, we demonstrate the generality and autonomy of OTOv2 on a variety
of model architectures such as VGG, ResNet, CARN, ConvNeXt, DenseNet and
StackedUnets, the majority of which cannot be handled by other methods without
extensive handcrafting efforts. Together with benchmark datasets including
CIFAR10/100, DIV2K, Fashion-MNIST, SVHN and ImageNet, its effectiveness is
validated by performance that is competitive with or better than the state of
the art. The source code is available at
https://github.com/tianyic/only_train_once.
Comment: Published at ICLR 2023. Note that a few dependency-graph images could
not be included in the arXiv version due to the size limit.
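The Zero-Invariant Group notion above can be demonstrated on a toy network. In the hedged numpy sketch below (a two-layer ReLU MLP with made-up shapes and data, not OTOv2's actual dependency analysis), the parameters tied to one hidden unit form a ZIG: zeroing the whole group leaves the network equivalent to a smaller network with that unit removed, which is exactly what makes structured pruning safe without fine-tuning:

```python
import numpy as np

# Hedged illustration of a Zero-Invariant Group (ZIG): for two stacked
# linear layers, the parameters tied to hidden unit 2 -- its incoming row
# of W1, its bias entry, and its outgoing column of W2 -- can be zeroed
# jointly, and the result equals the structurally compressed network with
# that unit deleted. Shapes and data are made up for illustration.

def forward(x, W1, b1, W2):
    return np.maximum(x @ W1.T + b1, 0.0) @ W2.T  # ReLU MLP

rng = np.random.default_rng(1)
W1 = rng.standard_normal((5, 3))
b1 = rng.standard_normal(5)
W2 = rng.standard_normal((2, 5))
x = rng.standard_normal((4, 3))

# Zero the ZIG of hidden unit 2.
W1z, b1z, W2z = W1.copy(), b1.copy(), W2.copy()
W1z[2, :], b1z[2], W2z[:, 2] = 0.0, 0.0, 0.0
yz = forward(x, W1z, b1z, W2z)

# Same output as the compressed network without unit 2 on every input
# (ReLU(0) = 0, and the zeroed outgoing column kills any contribution).
keep = [0, 1, 3, 4]
y_small = np.maximum(x @ W1[keep].T + b1[keep], 0.0) @ W2[:, keep].T
assert np.allclose(yz, y_small)
```

Grouping variables this way is the key property the DHSPG optimizer exploits: driving an entire ZIG to zero during training guarantees the group can be physically removed afterward with no change in the network's output.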